Language models can be prompted to perform a wide variety of zero- and few-shot learning tasks. However, performance varies significantly with the choice of prompt, and we do not yet understand why this happens or how to pick the best prompts. In this work, we analyze the factors that contribute to this variance and establish a new empirical hypothesis: the performance of a prompt is coupled with the extent to which the model is familiar with the language it contains. Over a wide range of tasks, we show that the lower the perplexity of the prompt, the better the prompt performs the task. As a result, we devise a method for creating prompts: (1) automatically extend a small seed set of manually written prompts by paraphrasing with GPT-3 and backtranslation, and (2) choose the lowest-perplexity prompts to obtain significant gains in performance.
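A minimal sketch of the selection step this abstract describes: score a pool of candidate prompts by their perplexity under a causal language model and keep the lowest-perplexity one. The scoring model (gpt2) and the candidate prompts below are illustrative assumptions, not the paper's exact setup.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM can score prompts this way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def prompt_perplexity(prompt: str) -> float:
    # Perplexity = exp(mean negative log-likelihood of the prompt tokens).
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the mean cross-entropy loss.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()

# Hypothetical candidate pool, e.g. paraphrases of one manually written seed.
candidates = [
    "Classify the sentiment of the following movie review:",
    "Is the following movie review positive or negative?",
    "Review sentiment (positive/negative):",
]
ranked = sorted(candidates, key=prompt_perplexity)
print("Lowest-perplexity prompt:", ranked[0])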
Over the past decade, the number of mobile robots that need computational resources to execute complex machine learning models has been increasing. Commonly, these robots rely on edge infrastructure, accessed over wireless communication, to execute computationally heavy tasks. However, the edge may become unavailable, forcing the robots to execute the tasks on board. This work focuses on enabling on-robot execution by reducing the complexity and the total number of parameters of pre-trained computer vision models. This is achieved with model compression techniques such as pruning and knowledge distillation. These compression techniques have strong theoretical and practical foundations, but their combined usage has not been widely explored in the literature. This work therefore focuses in particular on investigating the effects of combining the two compression techniques. The results show that up to 90% of the total number of parameters of a computer vision model can be removed without significantly reducing the model's accuracy.
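A minimal sketch, under assumed models and data, of the combination this abstract studies: magnitude-prune a pre-trained vision model, then use knowledge distillation from the uncompressed original to recover accuracy. The backbone (resnet18), the 90% sparsity target, and the distillation temperature are illustrative choices, not the paper's reported configuration.

import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune
from torchvision.models import resnet18

teacher = resnet18(weights="IMAGENET1K_V1").eval()  # uncompressed original
student = resnet18(weights="IMAGENET1K_V1")         # copy to be compressed

# Step 1 -- pruning: zero out 90% of each conv layer's weights by L1 magnitude.
for module in student.modules():
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.9)

# Step 2 -- distillation: train the pruned student to match the teacher's
# softened output distribution, plus the usual hard-label loss.
def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

optimizer = torch.optim.SGD(student.parameters(), lr=1e-3, momentum=0.9)
images = torch.randn(4, 3, 224, 224)   # dummy batch standing in for real data
labels = torch.randint(0, 1000, (4,))
with torch.no_grad():
    teacher_logits = teacher(images)
loss = distillation_loss(student(images), teacher_logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

In practice the distillation step would run over a full training set for several epochs rather than a single dummy batch.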
It is desirable for detection and classification algorithms to generalize to unfamiliar environments, but suitable benchmarks for quantitatively studying this phenomenon are not yet available. We present a dataset designed to measure recognition generalization to novel environments. The images in our dataset are harvested from twenty camera traps deployed to monitor animal populations. Camera traps are fixed at one location, hence the background changes little across images; capture is triggered automatically, hence there is no human bias. The challenge is to learn recognition in a handful of locations and to generalize animal detection and classification to new locations where no training data is available. In our experiments, state-of-the-art algorithms show excellent performance when tested at the same locations where they were trained. However, we find that generalization to new locations is poor, especially for classification systems.
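A minimal sketch of the evaluation protocol this benchmark implies: split the data by camera location rather than by random image, so that held-out locations contribute no training images at all. The record schema and field values below are hypothetical.

import random
from collections import defaultdict

# Hypothetical schema: (image_path, species_label, location_id).
records = [
    ("img_0001.jpg", "coyote", 3),
    ("img_0002.jpg", "raccoon", 3),
    ("img_0003.jpg", "coyote", 17),
    ("img_0004.jpg", "bobcat", 9),
    ("img_0005.jpg", "raccoon", 17),
]

by_location = defaultdict(list)
for rec in records:
    by_location[rec[2]].append(rec)

locations = sorted(by_location)
random.seed(0)
random.shuffle(locations)

# Hold out ~20% of locations entirely; their images never reach training.
n_test = max(1, len(locations) // 5)
test_locs = set(locations[:n_test])

train = [r for loc in locations[n_test:] for r in by_location[loc]]
test = [r for loc in test_locs for r in by_location[loc]]
print(f"{len(train)} train / {len(test)} test images, "
      f"{len(test_locs)} held-out locations")

A random per-image split would leak location-specific backgrounds into the test set and overstate generalization, which is exactly the failure mode the abstract reports.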